Counting and Sampling Triangles from a Graph Stream
Authors
Abstract
This paper presents a new space-efficient algorithm for counting and sampling triangles—and more generally, constant-sized cliques—in a massive graph whose edges arrive as a stream. Compared to prior work, our algorithm yields significant improvements in the space and time complexity for these fundamental problems. Our algorithm is simple to implement and has very good practical performance on large graphs.
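To make the problem setting concrete, here is a minimal Python sketch of a one-pass estimator based on independent edge sampling. It is a well-known baseline and not the algorithm of this paper, which the abstract does not describe in detail. The function name `estimate_triangles` and the parameters `p` and `seed` are illustrative assumptions. The sketch also assumes a simple undirected graph with no duplicate edges in the stream.

```python
import random
from collections import defaultdict


def estimate_triangles(edge_stream, p=1.0, seed=None):
    """One-pass triangle estimate via independent edge sampling.

    Illustrative baseline only (hypothetical helper, not the paper's method).
    Assumes a simple undirected graph: self-loops are skipped and an edge
    that is already in the sample is ignored.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    rng = random.Random(seed)
    neighbors = defaultdict(set)  # adjacency of the sampled edges only
    closed = 0                    # triangles found inside the sample

    for u, v in edge_stream:
        if u == v or v in neighbors[u]:
            continue
        if rng.random() >= p:     # drop the edge with probability 1 - p
            continue
        # Every common sampled neighbor closes one triangle with (u, v).
        closed += len(neighbors[u] & neighbors[v])
        neighbors[u].add(v)
        neighbors[v].add(u)

    # A triangle survives only if all three edges were kept: probability p^3.
    return closed / p ** 3


# Usage: the complete graph on 4 vertices has 4 triangles.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(estimate_triangles(k4, p=1.0))  # 4.0
```

With `p=1.0` every edge is kept and the count is exact. Smaller values of `p` use roughly a `p` fraction of the space for the sampled edges, at the cost of higher variance in the estimate.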
Similar resources
A Hybrid Sampling Scheme for Triangle Counting
We study the problem of estimating the number of triangles in a graph stream. No streaming algorithm can get sublinear space on all graphs, so methods in this area bound the space in terms of parameters of the input graph such as the maximum number of triangles sharing a single edge. We give a sampling algorithm that is additionally parameterized by the maximum number of triangles sharing a sin...
DiSLR: Distributed Sampling with Limited Redundancy For Triangle Counting in Graph Streams
Given a web-scale graph that grows over time, how should its edges be stored and processed on multiple machines for rapid and accurate estimation of the count of triangles? The count of triangles (i.e., cliques of size three) has proven useful in many applications, including anomaly detection, community detection, and link recommendation. For triangle counting in large and dynamic graphs, recent...
Towards Tighter Space Bounds for Counting Triangles and Other Substructures in Graph Streams
We revisit the much-studied problem of space-efficiently estimating the number of triangles in a graph stream, and extensions of this problem to counting fixed-sized cliques and cycles. For the important special case of counting triangles, we give a 4-pass, $(1 \pm \varepsilon)$-approximate, randomized algorithm using $\tilde{O}(\varepsilon^{-2} m^{3/2}/T)$ space, where $m$ is the number of edges and $T$ is a promised lower bound on the ...
Efficient Algorithms for Approximate Triangle Counting
Counting the number of triangles in a graph has many important applications in network analysis. Several frequently computed metrics like the clustering coefficient and the transitivity ratio need to count the number of triangles in the network. Furthermore, triangles are one of the most important graph classes considered in network mining. In this paper, we present a new randomized algorithm f...
Approximately Counting Triangles in Large Graph Streams Including Edge Duplicates with a Fixed Memory Usage
Counting triangles in a large graph is important for detecting network anomalies such as spam web pages and suspicious accounts (e.g., fraudsters and advertisers) on online social networks. However, it is challenging to compute the number of triangles in a large graph represented as a stream of edges with a low computational cost when given a limited memory. Recently, several effective sampling...
Journal title:
- PVLDB
Volume 6, Issue
Pages -
Publication date: 2013